Sequential Recurrent Neural Networks for Language Modeling
نویسندگان
چکیده
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word based on the history of the last N words, whereas Recurrent Neural Networks (RNN) perform the same task based only on the last word and some context information that cycles in the network. This paper presents a novel approach, which bridges the gap between these two categories of networks. In particular, we propose an architecture which takes advantage of the explicit, sequential enumeration of the word history in FNN structure while enhancing each word representation at the projection layer through recurrent context information that evolves in the network. The context integration is performed using an additional word-dependent weight matrix that is also learned during the training. Extensive experiments conducted on the Penn Treebank (PTB) and the Large Text Compression Benchmark (LTCB) corpus showed a significant reduction of the perplexity when compared to state-of-the-art feedforward as well as recurrent neural network architectures.
منابع مشابه
Feedforward Sequential Memory Neural Networks without Recurrent Feedback
We introduce a new structure for memory neural networks, called feedforward sequential memory networks (FSMN), which can learn long-term dependency without using recurrent feedback. The proposed FSMN is a standard feedforward neural networks equipped with learnable sequential memory blocks in the hidden layers. In this work, we have applied FSMN to several language modeling (LM) tasks. Experime...
متن کاملCapturing Dependency Syntax with "Deep" Sequential Models
Neural network (“deep learning”) models are taking over machine learning approaches for language by storm. In particular, recurrent neural networks (RNNs), which are flexible non-markovian models of sequential data, were shown to be effective for a variety of language processing tasks. Somewhat surprisingly, these seemingly purely sequential models are very capable at modeling syntactic phenome...
متن کاملCompact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition
In acoustic modeling for large vocabulary continuous speech recognition, it is essential to model long term dependency within speech signals. Usually, recurrent neural network (RNN) architectures, especially the long short term memory (LSTM) models, are the most popular choice. Recently, a novel architecture, namely feedforward sequential memory networks (FSMN), provides a non-recurrent archite...
متن کاملTensor-Train Recurrent Neural Networks for Video Classification
The Recurrent Neural Networks and their variants have shown promising performances in sequence modeling tasks such as Natural Language Processing. These models, however, turn out to be impractical and difficult to train when exposed to very high-dimensional inputs due to the large input-to-hidden weight matrix. This may have prevented RNNs’ large-scale application in tasks that involve very hig...
متن کاملContext-free and context-sensitive dynamics in recurrent neural networks
Continuous-valued recurrent neural networks can learn mechanisms for processing context-free languages. The dynamics of such networks is usually based on damped oscillation around fixed points in state space and requires that the dynamical components are arranged in certain ways. It is shown that qualitatively similar dynamics with similar constraints hold for abc, a context-sensitive language....
متن کاملQuasi-Recurrent Neural Networks
Recurrent neural networks are a powerful tool for modeling sequential data, but the dependence of each timestep’s computation on the previous timestep’s output limits parallelism and makes RNNs unwieldy for very long sequences. We introduce quasi-recurrent neural networks (QRNNs), an approach to neural sequence modeling that alternates convolutional layers, which apply in parallel across timest...
متن کامل